Automatic segmentation and clustering of speech using sparse coding

نویسندگان

  • Wiehan Agenbag
  • Willie Smit
  • Thomas Niesler
چکیده

We investigate the application of sparse coding and dictionary learning to the discovery of sub-word units in speech. The ultimate goal is to generate pronunciation dictionaries that could be used for automatic speech recognition (ASR). A dictionary of sparse coding atoms is trained to code a subset of the TIMIT corpus. Some of the trained units exhibit strong correlation with specific reference phonemes. It is found that our sparse coding model does not place sufficient constraints for the activation of atoms to be temporally isolated, which rules out its direct application to speech segmentation. We also investigate the consistency with which orthographically identical utterances are coded. We find that the sparse coding model used in this study generates codes that contain too much variation for it to be useful for generating pronunciation dictionaries for ASR.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic segmentation and clustering of speech using sparse coding and metaheuristic search

We propose a constrained shift and scale invariant sparse coding model for the purpose of unsupervised segmentation and clustering of speech into acoustically relevant sub-word units for automatic speech recognition. We introduce a novel local search algorithm that iteratively improves the acoustic relevance of the automatically-determined sub-word units from a random initialization by repeated...

متن کامل

Fuzzy Clustering Approach Using Data Fusion Theory and its Application To Automatic Isolated Word Recognition

 In this paper, utilization of clustering algorithms for data fusion in decision level is proposed. The results of automatic isolated word recognition, which are derived from speech spectrograph and Linear Predictive Coding (LPC) analysis, are combined with each other by using fuzzy clustering algorithms, especially fuzzy k-means and fuzzy vector quantization. Experimental results show that the...

متن کامل

Automatic Segmentation of the Gross Tumor Volume in Prostate Carcinoma Using Fuzzy Clustering in Gallium-68 PSMA PET/CT Scan

Introduction: Modern radiotherapy (RT) techniques allow a highly precise deposition of the radiation dose in tumor. So, high conformal tumor doses can be reached while sparing critical organs at risk. Materials and Methods: This study was conducted in three phases. In the first phase; Fourteen patients with primary or recurrent prostate cancer receive Gallium-...

متن کامل

Automatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI

Background: Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics which can be used as biomarkers for prostate lesions detection and characterization.Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...

متن کامل

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014